Bike Ridership EDA
Read in Datasets
- GeoJSON for the bike lane data
- Capital Bikeshare trip dataset for March 2023
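The loading code is not shown on this page, so the following is only a minimal sketch of how the two inputs could be read. The function names and the file paths in the usage comments are hypothetical. The GeoJSON is parsed with the standard `json` module and flattened with `pd.json_normalize`; a geospatial library may well have been used instead.

```python
import json

import pandas as pd


def load_trips(csv_path):
    """Read a trip export (one row per ride) into a DataFrame."""
    return pd.read_csv(csv_path)


def load_bike_lanes(geojson_path):
    """Flatten a GeoJSON FeatureCollection into one row per feature."""
    with open(geojson_path, encoding="utf-8") as f:
        collection = json.load(f)
    # Nested keys become dotted column names, e.g. "geometry.type".
    return pd.json_normalize(collection["features"])


# Hypothetical usage; these file names are assumptions, not the real paths:
# bikeshare_df = load_trips("capital_bikeshare_march.csv")
# bikelanes_df = load_bike_lanes("bike_lanes.geojson")
```

Keeping the loaders as small functions makes it easy to swap in a different file or month without touching the analysis cells.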
EDA using Altair
First, let’s look at the top bike stations in D.C.
The table below shows the 15 stations where bikeshare trips most often begin.
Code
# Create groupby of top 15 most popular bike stations
grouped_df = bikeshare_df.groupby('start_station_name').agg({'ride_id': 'count'}).reset_index()
grouped_df.rename(columns={'ride_id': 'count_rides'}, inplace = True)
# Keep only top 15
grouped_df = grouped_df.sort_values('count_rides', ascending = False).head(15)
grouped_df
| | start_station_name | count_rides |
|---|---|---|
| 282 | Columbus Circle / Union Station | 3441 |
| 508 | New Hampshire Ave & T St NW | 2819 |
| 434 | Lincoln Memorial | 2767 |
| 61 | 15th & P St NW | 2649 |
| 615 | Smithsonian-National Mall / Jefferson Dr & 12t... | 2574 |
| 395 | Jefferson Dr & 14th St SW | 2371 |
| 178 | 5th & K St NW | 2350 |
| 396 | Jefferson Memorial | 2335 |
| 49 | 14th & V St NW | 2305 |
| 105 | 1st & M St NE | 2249 |
| 78 | 17th St & Independence Ave SW | 2144 |
| 174 | 4th St & Madison Dr NW | 2084 |
| 324 | Eastern Market Metro / Pennsylvania Ave & 8th ... | 2064 |
| 389 | Henry Bacon Dr & Lincoln Memorial Circle NW | 1984 |
| 459 | Massachusetts Ave & Dupont Circle NW | 1902 |
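The same ranking can be produced more compactly with `Series.value_counts`, which already sorts in descending order. The sketch below runs on a tiny made-up `trips` frame (an assumption for illustration, not the real data) and keeps the column names used above. The two approaches agree as long as `ride_id` has no missing values, since the `groupby` version counts non-null ride IDs.

```python
import pandas as pd

# Toy stand-in for bikeshare_df; the rows are invented for illustration.
trips = pd.DataFrame({
    "ride_id": ["a", "b", "c", "d", "e", "f"],
    "start_station_name": [
        "Lincoln Memorial", "15th & P St NW", "Lincoln Memorial",
        "Jefferson Memorial", "Lincoln Memorial", "15th & P St NW",
    ],
})

top_n = 2  # the analysis above uses 15
top_stations = (
    trips["start_station_name"]
    .value_counts()                      # rides per station, descending
    .head(top_n)                         # keep only the busiest stations
    .rename_axis("start_station_name")   # name the index before resetting
    .reset_index(name="count_rides")
)
print(top_stations)
```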
We can show this visually using a bar graph in Altair:
Code
# Create selection
selection = alt.selection_single(fields=['start_station_name'], name='Random')
color = alt.condition(
    selection,
    alt.Color('start_station_name:N', scale=alt.Scale(scheme="accent"), title="Station Name"),
    alt.value('lightgray'),
)
# Make bar graph from the trips that start at one of the top 15 stations
bar = (
    alt.Chart(bikeshare_df[bikeshare_df['start_station_name'].isin(grouped_df['start_station_name'].to_list())])
    .mark_bar()
    .encode(
        y='count(ride_id):Q',
        x=alt.X('start_station_name:N',
                sort=alt.EncodingSortField(field='ride_id', op='count', order='descending')),
        color=color,
        tooltip=['start_station_name:N', 'count(ride_id):Q'],
    )
    .add_selection(selection)
)
bar.title = "Top 15 Capital Bikeshare Stations"
bar.encoding.x.title = 'Station'
bar.encoding.y.title = 'Count of Rides in March 2023'
bar